1,412 research outputs found

An extended Stein-type covariance identity for the Pearson family with applications to lower variance bounds

Author: Afendras G.
Papadatos N.
Papathanasiou V.
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 03/05/2011

For an absolutely continuous (integer-valued) r.v. X of the Pearson (Ord) family, we show that, under natural moment conditions, a Stein-type covariance identity of order k holds (cf. [Goldstein and Reinert, J. Theoret. Probab. 18 (2005) 237–260]). This identity is closely related to the corresponding sequence of orthogonal polynomials, obtained by a Rodrigues-type formula, and provides convenient expressions for the Fourier coefficients of an arbitrary function. Application of the covariance identity yields some novel expressions for the corresponding lower variance bounds for a function of the r.v. X, expressions that seem to be known only in particular cases (for the Normal, see [Houdré and Kagan, J. Theoret. Probab. 8 (1995) 23–30]; see also [Houdré and Pérez-Abreu, Ann. Probab. 23 (1995) 400–419] for corresponding results related to the Wiener and Poisson processes). Some applications are also given. Comment: Published at http://dx.doi.org/10.3150/10-BEJ282 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
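As an illustration of the kind of identity this abstract refers to, the sketch below numerically checks only the classical first-order Stein identity for the Normal case, Cov(X, g(X)) = σ² E[g'(X)]. It is not the order-k Pearson identity of the paper; the function names and the choice of Gauss-Hermite quadrature are assumptions made for this example.

```python
import numpy as np

def normal_expectation(h, mu, sigma, nodes=40):
    """Approximate E[h(X)] for X ~ N(mu, sigma^2) by Gauss-Hermite quadrature."""
    # hermegauss uses the weight exp(-z^2 / 2); its weights sum to sqrt(2*pi).
    z, w = np.polynomial.hermite_e.hermegauss(nodes)
    return np.sum(w * h(mu + sigma * z)) / np.sqrt(2.0 * np.pi)

def stein_sides(g, g_prime, mu, sigma):
    """Return (Cov(X, g(X)), sigma^2 * E[g'(X)]) for X ~ N(mu, sigma^2)."""
    # Cov(X, g(X)) = E[(X - mu) g(X)] because E[X - mu] = 0.
    cov = normal_expectation(lambda x: (x - mu) * g(x), mu, sigma)
    rhs = sigma ** 2 * normal_expectation(g_prime, mu, sigma)
    return cov, rhs

# Hypothetical test function g(x) = x^3 with mu = 1, sigma = 2.
lhs, rhs = stein_sides(lambda x: x ** 3, lambda x: 3 * x ** 2, 1.0, 2.0)
```

For polynomial g the quadrature is exact up to rounding, so the two sides should agree closely.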

arXiv.org e-Print Archive

Crossref

Strengthened Chernoff-type variance bounds

Author: Afendras G.
Papadatos N.
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 05/02/2014

Let X be an absolutely continuous random variable from the integrated Pearson family and assume that X has finite moments of any order. Using some properties of the associated orthonormal polynomial system, we provide a class of strengthened Chernoff-type variance bounds. Comment: Published at http://dx.doi.org/10.3150/12-BEJ484 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

arXiv.org e-Print Archive

Crossref

Optimal Piecewise Linear Regression Algorithm for QSAR Modelling

Author: Cardoso-Silva Jonathan
Papadatos George
Papageorgiou Lazaros G.
Tsoka Sophia
Publication venue: 'Wiley'
Publication date: 24/09/2018

Quantitative Structure‐Activity Relationship (QSAR) models have been successfully applied to lead optimisation, virtual screening and other areas of drug discovery over the years. Recent studies, however, have focused on the development of models that are predictive but often not interpretable. In this article, we propose the application of a piecewise linear regression algorithm, OPLRAreg, to develop both predictive and interpretable QSAR models. The algorithm determines a feature to best separate the data into regions and identifies linear equations to predict the outcome variable in each region. A regularisation term is introduced to prevent overfitting problems and implicitly selects the most informative features. As OPLRAreg is based on mathematical programming, a flexible and transparent representation for optimisation problems, the algorithm also permits customised constraints to be easily added to the model. The proposed algorithm is presented as a more interpretable alternative to other commonly used machine learning algorithms and has shown comparable predictive accuracy to Random Forest, Support Vector Machine and Random Generalised Linear Model on tests with five QSAR data sets compiled from the ChEMBL database
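The idea of choosing one feature to split the data into regions and fitting a regularised linear equation in each region can be sketched as follows. This is not OPLRAreg, which the abstract describes as a mathematical programming formulation; it is a simplified exhaustive search with a single breakpoint and closed-form ridge fits, and all function names and parameters are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalised intercept column."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    penalty = lam * np.eye(A.shape[1])
    penalty[-1, -1] = 0.0  # do not shrink the intercept
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)

def predict(coef, X):
    """Apply a fitted coefficient vector (features then intercept)."""
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ coef

def fit_two_region_model(X, y, lam=1e-3, min_samples=5):
    """Try every feature and breakpoint; keep the split whose two regional
    ridge fits give the lowest total absolute error."""
    best = None
    for j in range(X.shape[1]):
        for b in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= b
            if left.sum() < min_samples or (~left).sum() < min_samples:
                continue
            c_lo = ridge_fit(X[left], y[left], lam)
            c_hi = ridge_fit(X[~left], y[~left], lam)
            err = (np.abs(predict(c_lo, X[left]) - y[left]).sum()
                   + np.abs(predict(c_hi, X[~left]) - y[~left]).sum())
            if best is None or err < best["error"]:
                best = {"feature": j, "breakpoint": b,
                        "low": c_lo, "high": c_hi, "error": err}
    return best
```

The ridge penalty here stands in for the regularisation term mentioned in the abstract; it shrinks coefficients but, unlike a sparsity-inducing penalty, does not set them exactly to zero.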

UCL Discovery

King's Research Portal

Development of text mining tools for information retrieval from patents

Author: A Faro
A Lourenço
C Wu
G Papadatos
KB Cohen
M Krallinger
MT Latimer
R Klinger
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/06/2017

Biomedical literature comprises an ever-increasing number of publications in natural language. Patents are a relevant fraction of these and are important sources of information because of the curated data produced by the granting process. However, their unstructured data makes searching for information a challenging task. To overcome this, biomedical text mining (BioTM) creates methodologies to search and structure that data. Several BioTM techniques can be applied to patents. Among them, Information Retrieval is the process by which relevant data is obtained from collections of documents. In this work, a patent pipeline was developed and integrated into ... FEDER - Federación Española de Enfermedades Raras (NORTE-01-0145-FEDER-000004)

Universidade do Minho: RepositoriUM

Crossref

Linear Estimation of Location and Scale Parameters Using Partial Maxima

Author: A Prèkopa
AK Gupta
BC Arnold
D Sengupta
E Parzen
EH Lloyd
EL Lehmann
FA Graybill
FJ Samaniego
G Hofmann
H Chernoff
HA David
J Galambos
J Shao
K Sarkadi
M Bagnoli
M Berger
M Burkschat
MC Jones
N Balakrishnan
N Papadatos
Nickos Papadatos
NR Mann
P Tryfos
R Pyke
RL Smith
SI Resnick
SM Stigler
W Hoeffding
Z Bai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/09/2010

Consider an i.i.d. sample X^*_1,X^*_2,...,X^*_n from a location-scale family, and assume that the only available observations consist of the partial maxima (or minima) sequence, X^*_{1:1},X^*_{2:2},...,X^*_{n:n}, where X^*_{j:j}=max{X^*_1,...,X^*_j}. This kind of truncation appears in several circumstances, including best performances in athletics events. In the case of partial maxima, the form of the BLUEs (best linear unbiased estimators) is quite similar to the form of the well-known Lloyd's (1952, Least-squares estimation of location and scale parameters using order statistics, Biometrika, vol. 39, pp. 88-95) BLUEs, based on (the sufficient sample of) order statistics, but, in contrast to the classical case, their consistency is no longer obvious. The present paper is mainly concerned with the scale parameter, showing that the variance of the partial maxima BLUE is at most of order O(1/log n), for a wide class of distributions. Comment: This article is devoted to the memory of my six-years-old, little daughter, Dionyssia, who leaved us on August 25, 2010, at Cephalonia isl. (26 pages, to appear in Metrika
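The partial maxima sequence defined in this abstract, X^*_{j:j}=max{X^*_1,...,X^*_j}, can be computed with a running maximum. The sketch below only shows how the observed sequence arises from a sample; it does not implement the BLUEs, and the function names are chosen for this example.

```python
import numpy as np

def partial_maxima(sample):
    """Return the running maximum X_{j:j} = max(X_1, ..., X_j) for each j."""
    return np.maximum.accumulate(np.asarray(sample, dtype=float))

def record_times(sample):
    """Return the 1-based indices j at which a new maximum is set."""
    m = partial_maxima(sample)
    # The first observation is always a record; later ones are records
    # only when the running maximum strictly increases.
    is_record = np.r_[True, m[1:] > m[:-1]]
    return np.flatnonzero(is_record) + 1

# Hypothetical sample, e.g. successive performances in an athletics event.
maxima = partial_maxima([3, 1, 4, 1, 5, 9, 2, 6])
```

Because the running maximum changes only at record times, much of the original sample is lost, which is why the abstract treats the consistency of estimators based on it as non-obvious.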

arXiv.org e-Print Archive

Crossref

Subclinical VZV reactivation in immunocompetent children hospitalized in the ICU associated with prolonged fever duration

Author: Breuer J.
Critselis E.
Lockwood J.
Papadatos J.
Papaevangelou V.
Papaloukas O.
Papassotiriou I.
Quinlivan M.
Sideri G.
Publication venue: European Society of Clinical Infectious Diseases. Published by Elsevier Ltd.
Publication date: 18/12/2012

A prospective observational study was conducted to examine whether asymptomatic VZV reactivation occurs in immunocompetent children hospitalized in an ICU and its impact on clinical outcome. A secondary aim was to test the hypothesis that vaccinated children have a lower risk of reactivation than naturally infected children. Forty immunocompetent paediatric ICU patients and healthy controls were enrolled. Patients were prospectively followed for 28 days. Clinical data were collected and varicella exposure was recorded. Admission serum levels of TNF-α, cortisol and VZV-IgG were measured. Blood and saliva samples were collected for VZV-DNA detection via real-time PCR. As a comparison, the detection of HSV-DNA was also examined. Healthy children matched for age and varicella exposure type (infection or vaccination) were also included. VZV reactivation was observed in 17% (7/39) of children. Children with VZV reactivation had extended duration of fever (OR = 1.17; 95% CI, 1.02–1.34). None of the varicella-vaccinated children or healthy controls had detectable VZV-DNA in any blood or saliva samples examined. HSV-DNA was detected in saliva from 33% of ICU children and 2.6% of healthy controls. Among children with viral reactivation, typing revealed wild-type VZV and HSV-1. In conclusion, VZV reactivation occurs in immunocompetent children under severe stress and is associated with prolonged duration of fever

Elsevier - Publisher Connector

UCL Discovery

Target identification of Mycobacterium tuberculosis phenotypic hits using a concerted chemogenomic, biophysical and structural approach

Author: Abell C
Ballell L
Barros D
Blaszczyk M
Blundell TL
Lelievre J
Mugumbate G
Overington JP
Papadatos G
Sabbah M
Silva E Costa Mendes V
Publication venue: Frontiers in Pharmacology
Publication date: 01/01/2017

Mycobacterium phenotypic hits are a good reservoir for new chemotypes for the treatment of tuberculosis. However, the absence of defined molecular targets and modes of action could lead to failure in drug development. Therefore, a combination of ligand-based and structure-based chemogenomic approaches followed by biophysical and biochemical validation has been used to identify targets for Mycobacterium tuberculosis phenotypic hits. Our approach identified EthR and InhA as targets for several hits, with some showing dual activity against these proteins. From the 35 predicted EthR inhibitors, eight exhibited an IC50 below 50 μM against M. tuberculosis EthR and three were confirmed to be also simultaneously active against InhA. Further hit validation was performed using X-ray crystallography yielding eight new crystal structures of EthR inhibitors. Although the EthR inhibitors attain their activity against M. tuberculosis by hitting yet undefined targets, these results provide new lead compounds that could be further developed to be used to potentiate the effect of EthA activated pro-drugs, such as ethionamide, thus enhancing their bactericidal effect. GM is grateful to the European Molecular Biology Laboratory and Marie Sklodowska-Curie Actions for funding this work. VM and MB acknowledge Bill & Melinda Gates Foundation [subcontract by the Foundation for the National Institutes of Health (NIH)] (OPP1024021). VM and MS acknowledge the European Community’s Seventh Framework Programme [grant number 260872]. GP would like to acknowledge the Wellcome Trust and the European Molecular Biology Laboratory for funding. JPO was funded by the member nation states of the European Molecular Biology Laboratory. TLB acknowledges The Wellcome Trust for funding and support (grant number 200814/Z/16/Z)

Frontiers - Publisher Connector

Apollo (Cambridge)

A document classifier for medicinal chemistry publications trained on the ChEMBL corpus

Author: Croset S.
Overington J.P.
Papadatos G.
Santos S.
Trubian S.
Westen G.J.P van
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: The large increase in the number of scientific publications has fuelled a need for semi- and fully automated text mining approaches in order to assist in the triage process, both for individual scientists and also for larger-scale data extraction and curation into public databases. Here, we introduce a document classifier, which is able to successfully distinguish between publications that are ‘ChEMBL-like’ (i.e. related to small molecule drug discovery and likely to contain quantitative bioactivity data) and those that are not. The unprecedented size of the medicinal chemistry literature collection, coupled with the advantage of manual curation and mapping to chemistry and biology, makes the ChEMBL corpus a unique resource for text mining. Results: The method has been implemented as a data protocol/workflow for both Pipeline Pilot (version 8.5) and KNIME (version 2.9). Both workflows and models are freely available at: ftp://ftp.ebi.ac.uk/pub/databases/chembl/text-mining. These can be readily modified to include additional keyword constraints to further focus searches. Conclusions: Large-scale machine learning document classification was shown to be very robust and flexible for this particular application, as illustrated in four distinct text-mining-based use cases. The models are readily available on two data workflow platforms, which we believe will allow the majority of the scientific community to apply them to their own data.

Springer - Publisher Connector

PubMed Central

Leiden University Scholary Publications

Evaluation of machine-learning methods for ligand-based virtual screening

Author: A Bender
A Ormerod
AE Klon
AM Capelli
AR Leach
B Chen
Beining Chen
C Williams
D Hand
D Rogers
D Wilton
DA Cosgrove
David J. Wood
DB Kitchen
DE Clark
DJ Hand
DJ Wilton
DM Hawkins
E Parzen
FL Stahura
G Harper
G Redl
G Schneider
George Papadatos
H Eckert
H Kubinyi
HM Berman
J Aitchison
J Bajorath
J Delaney
J Hert
JC Saeh
L Hodes
M Congreve
M Glick
M Wagener
M Whittle
N Christianini
N Nikolova
Nikolaus Stiefl
P Constans
P Domingos
P Willett
Paulette Greenidge
Peter Willett
Q Zhang
R P Sheridan
RD Brown
RD Cramer
RE Carhart
RO Duda
Robert F. Harrison
S Anzali
TJ McNeany
TM Mitchell
Xiao Qing Lewell
XY Xia
YC Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and we here discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it is little different in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training-set contains only a very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed
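The group fusion approach mentioned at the end of this abstract can be sketched as below. The abstract does not specify the similarity measure or the fusion rule, so this example assumes Tanimoto similarity on binary fingerprints combined with the MAX rule over a few known actives; the function names and toy fingerprints are hypothetical.

```python
import numpy as np

def tanimoto(a, b):
    """Tanimoto coefficient between two binary fingerprints."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # two empty fingerprints: treat similarity as zero
    return np.logical_and(a, b).sum() / union

def group_fusion_scores(actives, database):
    """Score each database molecule by its maximum similarity to any active
    (MAX fusion rule; an assumption made for this sketch)."""
    return np.array([max(tanimoto(ref, mol) for ref in actives)
                     for mol in database])

# Toy 4-bit fingerprints: two reference actives and four database molecules.
actives = [[1, 1, 0, 0], [0, 0, 1, 1]]
database = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 1, 1, 1]]
scores = group_fusion_scores(actives, database)
ranking = np.argsort(-scores)  # database indices, most similar first
```

A scheme like this needs no model training, which is consistent with the abstract's remark that it may do better than an NBC when only a few active molecules are available.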

Crossref

White Rose Research Online